Search results for "Audio signal processing"
Showing 10 of 18 documents
Real-time signal processing in embedded systems
2016
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization
2017
The growing interest in incorporating new features into mobile devices has increased the number of signal processing applications running on processors designed for mobile computing. A challenging signal processing field is acoustic source localization, which is attractive for applications such as automatic camera steering systems, human-machine interfaces, video gaming, and audio surveillance. In this context, the emergence of systems-on-chip (SoC) that contain a small graphics accelerator (GPU) contributes a notable increase in computational capacity while partially retaining the appealing low power consumption of embedded systems. This is the case, for example, of the Sam…
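Acoustic source localization pipelines like the one this abstract describes commonly build on time-difference-of-arrival (TDOA) estimation between microphone pairs. A minimal GCC-PHAT sketch (an illustrative building block, not the paper's implementation):

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the TDOA (seconds) of `sig` relative to `ref`
    using the generalized cross-correlation with phase transform."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-15          # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / float(fs)
```

With two or more microphone pairs, the resulting TDOAs can be intersected geometrically to estimate the source direction; the GPU acceleration discussed in the paper typically targets the FFTs above.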
2015
Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that de…
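The mapping described above (vertical position to pitch, horizontal position to interaural time and level differences) can be sketched as follows; all frequency ranges, ITD limits, and gain laws here are illustrative choices, not the device's actual transcoding parameters:

```python
import numpy as np

def sonify_point(x, y, width, height, fs=44100, dur=0.2):
    """Map a 2-D image position to a stereo tone:
    vertical position -> pitch, horizontal -> ITD and ILD.
    Parameter ranges are hypothetical, for illustration only."""
    # Pitch: top of the image maps to a higher frequency (200-1600 Hz).
    f = 200.0 * (2.0 ** (3.0 * (1.0 - y / height)))
    t = np.arange(int(fs * dur)) / fs
    tone = np.sin(2 * np.pi * f * t)
    # Horizontal position in [-1, 1]; negative = left.
    pan = 2.0 * x / width - 1.0
    itd = 0.0007 * pan                    # up to ~0.7 ms, roughly head-sized
    shift = int(round(abs(itd) * fs))
    left = np.sqrt(0.5 * (1 - pan)) * tone    # equal-power ILD
    right = np.sqrt(0.5 * (1 + pan)) * tone
    if pan > 0:    # source on the right: the left ear lags
        left = np.concatenate((np.zeros(shift), left))[: len(t)]
    elif pan < 0:
        right = np.concatenate((np.zeros(shift), right))[: len(t)]
    return np.stack((left, right), axis=1)
```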
On the Design of Probe Signals in Wireless Acoustic Sensor Networks Self-Positioning Algorithms
2018
A wireless acoustic sensor network comprises a distributed group of devices equipped with audio transducers. Typically, these devices can interoperate with each other using wireless links and perform collaborative audio signal processing. Ranging and self-positioning of the network nodes are examples of tasks that can be carried out collaboratively using acoustic signals. However, the environmental conditions can distort the emitted signals and complicate the ranging process. In this context, the selection of proper acoustic signals can facilitate the attainment of this goal and improve the localization accuracy. This letter deals with the design and evaluation of acoustic probe signals all…
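A common family of probe signals for acoustic ranging is the linear chirp, detected at the receiver with a matched filter; a minimal sketch under assumed parameters (sample rate, band, and speed of sound are illustrative, not the letter's design):

```python
import numpy as np

def linear_chirp(f0, f1, dur, fs):
    """Linear frequency sweep from f0 to f1 over `dur` seconds."""
    t = np.arange(int(fs * dur)) / fs
    return np.sin(2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / dur * t ** 2))

def estimate_range(received, probe, fs, c=343.0):
    """Matched-filter ranging: peak lag of the cross-correlation
    gives the propagation delay, converted to metres."""
    corr = np.correlate(received, probe, mode="full")
    lag = np.argmax(np.abs(corr)) - (len(probe) - 1)
    return lag / fs * c
```

Chirps are attractive probes because their autocorrelation is sharply peaked, which keeps the lag estimate robust against the environmental distortions the abstract mentions.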
Video preprocessing for audiovisual indexing
2003
We address the problem of detecting shots of subjects that are interviewed in news sequences. This is useful because such scenes usually contain important, reusable information for other news programs. In a previous paper, we presented a technique based on a priori knowledge of the editing techniques used in news sequences, which allowed a fast search of news stories (see Albiol, A. et al., 3rd Int. Conf. on Audio and Video-based Biometric Person Authentication, p.366-71, 2001). We now present a new shot descriptor technique which improves the previous search results by using a simple yet efficient algorithm based on the information contained in consecutive fra…
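Shot-level preprocessing of this kind often starts from inter-frame differences; a minimal sketch (the threshold and the plain mean-difference metric are illustrative stand-ins, not the paper's descriptor):

```python
import numpy as np

def shot_boundaries(frames, threshold=0.3):
    """Flag candidate shot boundaries where the mean absolute
    difference between consecutive frames exceeds a (hypothetical)
    threshold. `frames`: array (n_frames, H, W) with values in [0, 1].
    Returns the indices of the first frame of each new shot."""
    diffs = np.mean(np.abs(np.diff(frames.astype(float), axis=0)), axis=(1, 2))
    return [i + 1 for i, d in enumerate(diffs) if d > threshold]
```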
Decoding Children's Social Behavior
2013
We introduce a new problem domain for activity recognition: the analysis of children's social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1-2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3-5 minute child-adult interaction. In each session, the adult examiner followed a semi-structured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe met…
The neural basis of sublexical speech and corresponding nonspeech processing: a combined EEG-MEG study.
2014
We addressed the neural organization of speech versus nonspeech sound processing by investigating preattentive cortical auditory processing of changes in five features of a consonant–vowel syllable (consonant, vowel, sound duration, frequency, and intensity) and their acoustically matched nonspeech counterparts in a simultaneous EEG–MEG recording of mismatch negativity (MMN/MMNm). Overall, speech–sound processing was enhanced compared to nonspeech sound processing. This effect was strongest for changes which affect word meaning (consonant, vowel, and vowel duration) in the left and for the vowel identity change in the right hemisphere also. Furthermore, in the right hemisphere, spe…
The indexing of persons in news sequences using audio-visual data
2004
We describe a video indexing system that automatically searches for a specific person in a news sequence. The proposed approach combines audio and video confidence values extracted from speaker and face recognition analysis. The system also incorporates a shot selection module that seeks anchor shots, where the person on the scene is likely speaking. The system has been extensively tested on several news sequences with very good recognition rates.
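Combining audio and video confidence values as described above is a form of late fusion; one minimal sketch is a convex combination with a decision threshold (the weight and threshold below are illustrative, not the system's tuned values):

```python
def fuse_confidences(audio_conf, video_conf, w_audio=0.5):
    """Late fusion of speaker- and face-recognition confidences
    by a convex combination; the weight is a hypothetical choice."""
    return w_audio * audio_conf + (1 - w_audio) * video_conf

def is_target_person(audio_conf, video_conf, threshold=0.6):
    """Accept the identity hypothesis when the fused score
    clears a (hypothetical) decision threshold."""
    return fuse_confidences(audio_conf, video_conf) >= threshold
```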
Capturing and Indexing Rehearsals: The Design and Usage of a Digital Archive of Performing Arts
2015
Preserving the cultural heritage of the performing arts raises difficult and sensitive issues, as each performance is unique by nature and the juxtaposition between the performers and the audience cannot be easily recorded. In this paper, we report on an experimental research project to preserve another aspect of the performing arts: the history of their rehearsals. We have specifically designed non-intrusive video recording and on-site documentation techniques to make this process transparent to the creative crew, and have developed a complete workflow to publish the recorded video data and their corresponding meta-data online as Open Data using state-of-the-art audi…
Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network
2020
In this paper, we propose a model for the Environment Sound Classification task (ESC) that consists of multiple feature channels given as input to a Deep Convolutional Neural Network (CNN) with an attention mechanism. The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant-Q Transform (CQT) and Chromagram. Such a combination of features has not previously been used for signal or audio processing. We also employ a deeper CNN (DCNN) than previous models, consisting of spatially separable convolutions working on the time and feature domains separately. Alongside, we use atten…
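The multi-channel input described above amounts to stacking several time-frequency views of the same clip into one tensor for the CNN. A minimal sketch of that assembly step, using two simple stand-in channels (linear and log magnitude spectrograms) in place of the paper's MFCC/GFCC/CQT/chroma features:

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=256):
    """Magnitude STFT via a sliding Hann window (minimal sketch)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop: i * hop + n_fft] * win
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def feature_channels(x):
    """Stack several spectral views of one clip into a
    (channels, time, freq) tensor, analogous to the paper's
    multi-channel CNN input. The two channels here are
    illustrative stand-ins for the actual feature set."""
    mag = stft_mag(x)
    return np.stack([mag, np.log1p(mag)])
```

In practice each feature type would be computed at a matching time-frequency resolution so all channels share one shape before stacking.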